Fermentation of pentose sugars

Field of the invention 
The present invention relates to host cells transformed with a nucleic acid
sequence encoding a eukaryotic xylose isomerase. The xylose isomerase is expressed in
the host cell to confer the ability of isomerising xylose to xylulose. The host cell is used 
in a process for the production of ethanol and other fermentation products by 
fermentation of a pentose-containing medium. The present invention further relates to 
nucleic acid sequences encoding eukaryotic xylose isomerases.



Background of the invention 
Large-scale consumption of traditional fossil fuels (petroleum-based fuels) in
the last few decades has contributed to high levels of pollution. Moreover, the
realisation that the world stock of petroleum is not boundless, combined with
growing environmental awareness, has stimulated new initiatives to investigate the
feasibility of alternative fuels such as ethanol, which could realise a 60-90% decrease
in CO2 production. Although biomass-derived ethanol may be produced by
fermentation of hexose sugars that are obtained from many different sources, the
substrates used so far for industrial-scale production of fuel alcohol are cane sugar
and corn starch. The drawback of these substrates is their high cost.

Expanding fuel ethanol production requires the ability to use lower-cost
feedstocks. Presently, only lignocellulosic feedstock from plant biomass would be
available in sufficient quantities to substitute for the crops used for ethanol
production. The major fermentable sugars from lignocellulosic materials are glucose
and xylose, constituting respectively about 40% and 25% of lignocellulose. However,
most yeasts that are capable of alcoholic fermentation, like Saccharomyces cerevisiae,
are not capable of using xylose as a carbon source. Additionally, no organisms are
known that can ferment xylose to ethanol with both a high ethanol yield and a high
ethanol productivity. To enable the commercial production of ethanol from
lignocellulose hydrolysate, an organism possessing both these properties would be
required. Thus it is an object of the present invention to provide for a yeast that is
capable of both alcoholic fermentation and of using xylose as a carbon source.
D-xylose is metabolisable by numerous microorganisms such as enteric bacteria,
some yeasts and fungi. In most xylose-utilising bacteria, xylose is directly isomerised
to D-xylulose by xylose (glucose) isomerase (XI). Filamentous fungi and yeasts are,
however, not capable of this one-step isomerisation and first reduce xylose to xylitol by
the action of xylose reductase (XR), after which the xylitol is converted to xylulose by
xylitol dehydrogenase (XDH). The first step requires NAD(P)H as a co-factor whereas
the second step requires NAD+. The xylulose that is produced subsequently enters the
pentose phosphate pathway (PPP) after it is phosphorylated by xylulose kinase (XK).
Anaerobic fermentation of xylose to ethanol is not possible in organisms with a strictly
NADPH-dependent xylose reductase (XR). This is because xylitol dehydrogenase
(XDH) is strictly NAD+ dependent, resulting in a redox imbalance (i.e., NAD+
depletion). To solve the redox imbalance under anaerobic conditions, the organism
produces by-products such as glycerol and xylitol. Similarly, aerobic production of
β-lactams on xylose is also negatively influenced as compared to β-lactam production
on glucose. A likely cause for these low yields is again a relatively high demand for
reducing equivalents in the form of NADPH in this route, compared to the use of
glucose (W.M. van Gulik et al. Biotechnol. Bioeng. Vol. 68, No. 6, June 20, 2000).
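
The cofactor argument in the preceding paragraph can be followed with a small bookkeeping sketch. It is only an illustration: the route names, the dictionary layout and the function `net_cofactors` are hypothetical, and the XR step is modelled for the strictly NADPH-dependent case discussed in the text.

```python
from collections import Counter

# Redox cofactor changes per mole of xylose, following the reactions named in the text.
# Negative values are consumed, positive values are produced.
ROUTES = {
    # Two-step fungal/yeast route with a strictly NADPH-dependent XR (assumed case)
    "XR_XDH": [
        {"NADPH": -1, "NADP+": +1},  # XR: xylose -> xylitol
        {"NAD+": -1, "NADH": +1},    # XDH: xylitol -> xylulose
    ],
    # One-step bacterial route: no redox cofactor involved
    "XI": [{}],
}


def net_cofactors(route, moles_xylose=1):
    """Return the net cofactor change for converting xylose to xylulose."""
    total = Counter()
    for step in ROUTES[route]:
        for cofactor, change in step.items():
            total[cofactor] += change * moles_xylose
    return {name: amount for name, amount in total.items() if amount != 0}


if __name__ == "__main__":
    print(net_cofactors("XR_XDH", 10))
    print(net_cofactors("XI", 10))
```

In this sketch the two-step route consumes NADPH and NAD+ in equal amounts but regenerates neither, which is the imbalance described above; the isomerase route leaves all four cofactor pools unchanged.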

Over the years many attempts have been made to introduce xylose metabolism in
S. cerevisiae and similar yeasts, as reviewed in Zaldivar et al. (2001, Appl. Microbiol.
Biotechnol. 56: 17-34). One approach concerns the expression of at least genes
encoding a xylose (aldose) reductase and a xylitol dehydrogenase, e.g. the XYL1 and
XYL2 of Pichia stipitis, in S. cerevisiae (US 5,866,382; WO 95/13362; and WO
97/42307). Although this approach enables growth of S. cerevisiae on xylose, it
generally suffers from a low ethanol productivity and/or yield as well as a high xylitol
production, mainly as a result of the redox imbalance between XR and XDH.

The expression of an XI in S. cerevisiae or related yeast or in filamentous fungi
would circumvent the redox imbalance and consequent xylitol production and
excretion. Xylose isomerase genes from several bacteria have been inserted in S.
cerevisiae; however, expression of mesophilic prokaryotic XIs in S. cerevisiae did not
lead to active XI (Amore and Hollenberg, 1989, Nucleic Acids Res. 17: 7515; Amore et
al., 1989, Appl. Microbiol. Biotechnol. 30: 351-357; Chan et al., 1986, Biotechnol. Lett.
8: 231-234; Chan et al., 1989, Appl. Microbiol. Biotechnol. 31: 524-528; Ho et al.,
1983, Fed. Proc. Fed. Am. Soc. Exp. Biol. 42: 2167; Hollenberg, 1987, EBC-
Symposium on Brewer's Yeast, Helsinki (Finland), 24-25 Nov 1986; Sarthy et al.,
1987, Appl. Environ. Microbiol. 53: 1996-2000; Ueng et al., 1985, Biotechnol. Lett. 7:
153-158). Nevertheless, two XIs from thermophilic bacteria expressed in S. cerevisiae
showed a specific activity of 1 μmol per minute per mg at 85°C (Bao et al., 1999,
Weishengwu-Xuebao 39: 49-54; Walfridson et al., 1996, Appl. Environ. Microbiol. 61:
4184-4190). However, at physiological temperature for S. cerevisiae (20-35°C) only a
few percent of this activity is left, which is not sufficient for efficient alcoholic
fermentation from xylose. Thus, there is still a need for nucleic acids encoding an XI
that can be expressed in yeasts to provide sufficient XI activity under physiological
conditions to allow for the use of xylose as carbon source.

Description of the invention 

Definitions 
Xylose isomerase

The enzyme "xylose isomerase" (EC 5.3.1.5) is herein defined as an enzyme that
catalyses the direct isomerisation of D-xylose into D-xylulose and vice versa. The
enzyme is also known as a D-xylose ketoisomerase. Some xylose isomerases are also
capable of catalysing the conversion between D-glucose and D-fructose and are
therefore sometimes referred to as glucose isomerase. Xylose isomerases require
magnesium as cofactor. Xylose isomerases of the invention may be further defined by
their amino acid sequence as herein described below. Likewise xylose isomerases may
be defined by the nucleotide sequences encoding the enzyme as well as by nucleotide
sequences hybridising to a reference nucleotide sequence encoding a xylose isomerase
as herein described below.

A unit (U) of xylose isomerase activity is herein defined as the amount of enzyme
producing 1 nmol of xylulose per minute, in a reaction mixture containing 50 mM
phosphate buffer (pH 7.0), 10 mM xylose and 10 mM MgCl2, at 37°C. Xylulose formed
is determined by the method of Dische and Borenfreund (1951, J. Biol. Chem. 192:
583-587) or by HPLC as described in the Examples.
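
The unit definition above, together with the per-mg-protein specific activity used later in this description, can be expressed as a short calculation. The sketch below is illustrative only; the function names and the example numbers are hypothetical and not taken from the Examples.

```python
def xylose_isomerase_units(nmol_xylulose, minutes):
    """Activity in units (U): nmol of xylulose produced per minute."""
    if minutes <= 0:
        raise ValueError("assay time must be positive")
    return nmol_xylulose / minutes


def specific_activity(nmol_xylulose, minutes, mg_protein):
    """Specific activity in U per mg protein of cell-free lysate."""
    if mg_protein <= 0:
        raise ValueError("protein amount must be positive")
    return xylose_isomerase_units(nmol_xylulose, minutes) / mg_protein


if __name__ == "__main__":
    # Hypothetical assay: 1500 nmol xylulose formed in 10 min by 0.5 mg protein
    print(specific_activity(1500, 10, 0.5))  # U per mg protein
```

The assay conditions themselves (buffer, xylose and MgCl2 concentrations, temperature) are not modelled; the functions only convert a measured amount of xylulose into units.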

Sequence identity and similarity

Sequence identity is herein defined as a relationship between two or more amino
acid (polypeptide or protein) sequences or two or more nucleic acid (polynucleotide)
sequences, as determined by comparing the sequences. In the art, "identity" also means
the degree of sequence relatedness between amino acid or nucleic acid sequences, as
the case may be, as determined by the match between strings of such sequences.
"Similarity" between two amino acid sequences is determined by comparing the amino
acid sequence and its conserved amino acid substitutes of one polypeptide to the
sequence of a second polypeptide. "Identity" and "similarity" can be readily calculated
by known methods, including but not limited to those described in Computational
Molecular Biology, Lesk, A. M., ed., Oxford University Press, New York, 1988;
Biocomputing: Informatics and Genome Projects, Smith, D. W., ed., Academic Press,
New York, 1993; Computer Analysis of Sequence Data, Part I, Griffin, A. M., and
Griffin, H. G., eds., Humana Press, New Jersey, 1994; Sequence Analysis in Molecular
Biology, von Heinje, G., Academic Press, 1987; Sequence Analysis Primer,
Gribskov, M. and Devereux, J., eds., M Stockton Press, New York, 1991; and Carillo,
H., and Lipman, D., SIAM J. Applied Math., 48:1073 (1988).

Preferred methods to determine identity are designed to give the largest match
between the sequences tested. Methods to determine identity and similarity are codified
in publicly available computer programs. Preferred computer program methods to
determine identity and similarity between two sequences include e.g. the GCG program
package (Devereux, J., et al., Nucleic Acids Research 12 (1):387 (1984)), BestFit,
BLASTP, BLASTN, and FASTA (Altschul, S. F. et al., J. Mol. Biol. 215:403-410
(1990)). The BLAST X program is publicly available from NCBI and other sources
(BLAST Manual, Altschul, S., et al., NCBI NLM NIH Bethesda, MD 20894; Altschul,
S., et al., J. Mol. Biol. 215:403-410 (1990)). The well-known Smith Waterman
algorithm may also be used to determine identity.

Preferred parameters for polypeptide sequence comparison include the following:
Algorithm: Needleman and Wunsch, J. Mol. Biol. 48:443-453 (1970); Comparison
matrix: BLOSUM62 from Henikoff and Henikoff, Proc. Natl. Acad. Sci. USA.
89:10915-10919 (1992); Gap Penalty: 12; and Gap Length Penalty: 4. A program
useful with these parameters is publicly available as the "Gap" program from Genetics
Computer Group, located in Madison, WI. The aforementioned parameters are the
default parameters for amino acid comparisons (along with no penalty for end gaps).

Preferred parameters for nucleic acid comparison include the following:
Algorithm: Needleman and Wunsch, J. Mol. Biol. 48:443-453 (1970); Comparison
matrix: matches=+10, mismatch=0; Gap Penalty: 50; Gap Length Penalty: 3. Available
as the Gap program from Genetics Computer Group, located in Madison, WI. Given
above are the default parameters for nucleic acid comparisons.
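
As a rough illustration of how a global alignment yields a percent identity, the following sketch implements the Needleman and Wunsch algorithm with the match and mismatch scores given above. It is a simplification and not the GCG Gap program: it uses a single per-position gap penalty (the value -8 is an arbitrary assumption) instead of separate gap and gap length penalties, penalises end gaps, and defines identity as identical positions divided by alignment length, which is one of several conventions.

```python
def needleman_wunsch(a, b, match=10, mismatch=0, gap=-8):
    """Globally align two sequences; return the two gapped strings."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(diag, score[i - 1][j] + gap, score[i][j - 1] + gap)

    # Trace back from the bottom-right corner, preferring the diagonal on ties.
    i, j = n, m
    out_a, out_b = [], []
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + (
            match if a[i - 1] == b[j - 1] else mismatch
        ):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(aligned_a, aligned_b):
    """Identical, non-gap columns as a percentage of the alignment length."""
    if not aligned_a:
        return 0.0
    same = sum(x == y and x != "-" for x, y in zip(aligned_a, aligned_b))
    return 100.0 * same / len(aligned_a)


if __name__ == "__main__":
    aligned = needleman_wunsch("ACGT", "AGT")
    print(aligned, percent_identity(*aligned))
```

For real sequence comparisons an established alignment program with the parameters listed above would be used; the sketch only shows where the identity percentage comes from.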
Optionally, in determining the degree of amino acid similarity, the skilled person may
also take into account so-called "conservative" amino acid substitutions, as will be clear
to the skilled person. Conservative amino acid substitutions refer to the
interchangeability of residues having similar side chains. For example, a group of
amino acids having aliphatic side chains is glycine, alanine, valine, leucine, and
isoleucine; a group of amino acids having aliphatic-hydroxyl side chains is serine and
threonine; a group of amino acids having amide-containing side chains is asparagine
and glutamine; a group of amino acids having aromatic side chains is phenylalanine,
tyrosine, and tryptophan; a group of amino acids having basic side chains is lysine,
arginine, and histidine; and a group of amino acids having sulphur-containing side
chains is cysteine and methionine. Preferred conservative amino acid substitution
groups are: valine-leucine-isoleucine, phenylalanine-tyrosine, lysine-arginine,
alanine-valine, and asparagine-glutamine. Substitutional variants of the amino acid
sequence disclosed herein are those in which at least one residue in the disclosed
sequences has been removed and a different residue inserted in its place. Preferably,
the amino acid change is conservative. Preferred conservative substitutions for each of
the naturally occurring amino acids are as follows: Ala to ser; Arg to lys; Asn to gln or
his; Asp to glu; Cys to ser or ala; Gln to asn; Glu to asp; Gly to pro; His to asn or gln;
Ile to leu or val; Leu to ile or val; Lys to arg, gln or glu; Met to leu or ile; Phe to met,
leu or tyr; Ser to thr; Thr to ser; Trp to tyr; Tyr to trp or phe; and Val to ile or leu.
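
The list of preferred conservative substitutions above lends itself to a lookup table. The sketch below transcribes that list into a dictionary; the names `PREFERRED_SUBSTITUTIONS` and `is_preferred_conservative` are hypothetical. The table is directional, as in the text (for example, Ala to ser is listed but Ser to ala is not).

```python
# Preferred conservative substitutions, transcribed from the list in the text.
# Keys are the original residue, values the preferred replacements.
PREFERRED_SUBSTITUTIONS = {
    "Ala": {"Ser"},
    "Arg": {"Lys"},
    "Asn": {"Gln", "His"},
    "Asp": {"Glu"},
    "Cys": {"Ser", "Ala"},
    "Gln": {"Asn"},
    "Glu": {"Asp"},
    "Gly": {"Pro"},
    "His": {"Asn", "Gln"},
    "Ile": {"Leu", "Val"},
    "Leu": {"Ile", "Val"},
    "Lys": {"Arg", "Gln", "Glu"},
    "Met": {"Leu", "Ile"},
    "Phe": {"Met", "Leu", "Tyr"},
    "Ser": {"Thr"},
    "Thr": {"Ser"},
    "Trp": {"Tyr"},
    "Tyr": {"Trp", "Phe"},
    "Val": {"Ile", "Leu"},
}


def is_preferred_conservative(original, replacement):
    """True if replacing `original` by `replacement` is on the preferred list."""
    allowed = PREFERRED_SUBSTITUTIONS.get(original.capitalize(), set())
    return replacement.capitalize() in allowed


if __name__ == "__main__":
    print(is_preferred_conservative("Lys", "arg"))
    print(is_preferred_conservative("Gly", "ala"))
```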
Hybridising nucleic acid sequences 

Nucleotide sequences encoding xylose isomerases or xylulose kinases of the
invention may also be defined by their capability to hybridise with the nucleotide
sequences of SEQ ID NO. 2 or SEQ ID NO. 4, respectively, under moderate, or
preferably under stringent hybridisation conditions. Stringent hybridisation conditions
are herein defined as conditions that allow a nucleic acid sequence of at least about 25,
preferably about 50, 75 or 100 nucleotides, and most preferably of about 200 or more
nucleotides, to hybridise at a temperature of about 65°C in a solution comprising about
1 M salt, preferably 6 x SSC or any other solution having a comparable ionic strength,
and washing at 65°C in a solution comprising about 0.1 M salt, or less, preferably 0.2 x
SSC or any other solution having a comparable ionic strength. Preferably, the
hybridisation is performed overnight, i.e. at least for 10 hours, and preferably washing is
performed for at least one hour with at least two changes of the washing solution.
These conditions will usually allow the specific hybridisation of sequences having
about 90% or more sequence identity.

Moderate conditions are herein defined as conditions that allow a nucleic acid
sequence of at least 50 nucleotides, preferably of about 200 or more nucleotides, to
hybridise at a temperature of about 45°C in a solution comprising about 1 M salt,
preferably 6 x SSC or any other solution having a comparable ionic strength, and 
washing at room temperature in a solution comprising about 1 M salt, preferably 6 x 
SSC or any other solution having a comparable ionic strength. Preferably, the 
hybridisation is performed overnight, i.e. at least for 10 hours, and preferably washing 
is performed for at least one hour with at least two changes of the washing solution. 
These conditions will usually allow the specific hybridisation of sequences having up 
to 50% sequence identity. The person skilled in the art will be able to modify these 
hybridisation conditions in order to specifically identify sequences varying in identity 
between 50% and 90%. 
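
The two sets of hybridisation conditions can be summarised as plain data, and the SSC strengths can be related to the "about 1 M salt" and "about 0.1 M salt, or less" figures. The sketch below assumes the standard composition of 1 x SSC (0.15 M NaCl and 0.015 M trisodium citrate), which this description does not itself state; the class and field names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed standard composition of 1 x SSC (not stated in this description).
NACL_M_PER_1X = 0.15
TRISODIUM_CITRATE_M_PER_1X = 0.015


def ssc_sodium_molarity(fold):
    """Approximate Na+ molarity of an SSC solution of the given strength."""
    # Each trisodium citrate contributes three sodium ions.
    return fold * (NACL_M_PER_1X + 3 * TRISODIUM_CITRATE_M_PER_1X)


@dataclass(frozen=True)
class HybridisationConditions:
    min_length_nt: int
    hybridisation_temp_c: float
    hybridisation_ssc: float
    wash_temp_c: Optional[float]  # None means room temperature
    wash_ssc: float
    min_hybridisation_hours: int


# Values taken from the definitions in the text.
STRINGENT = HybridisationConditions(25, 65.0, 6.0, 65.0, 0.2, 10)
MODERATE = HybridisationConditions(50, 45.0, 6.0, None, 6.0, 10)

if __name__ == "__main__":
    for name, cond in (("stringent", STRINGENT), ("moderate", MODERATE)):
        print(name,
              round(ssc_sodium_molarity(cond.hybridisation_ssc), 3),
              round(ssc_sodium_molarity(cond.wash_ssc), 3))
```

Under that assumption, 6 x SSC corresponds to roughly 1.2 M sodium and 0.2 x SSC to roughly 0.04 M, consistent with the approximate salt concentrations given in the definitions.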
Operably linked

As used herein, the term "operably linked" refers to a linkage of polynucleotide 
elements in a functional relationship. A nucleic acid is "operably linked" when it is 
placed into a functional relationship with another nucleic acid sequence. For instance, a 
promoter or enhancer is operably linked to a coding sequence if it affects the 
transcription of the coding sequence. Operably linked means that the DNA sequences 
being linked are typically contiguous and, where necessary to join two protein coding 
regions, contiguous and in reading frame. 
Promoter 

As used herein, the term "promoter" refers to a nucleic acid fragment that 
functions to control the transcription of one or more genes, located upstream with 
respect to the direction of transcription of the transcription initiation site of the gene, 
and is structurally identified by the presence of a binding site for DNA-dependent RNA 
polymerase, transcription initiation sites and any other DNA sequences, including, but 
not limited to transcription factor binding sites, repressor and activator protein binding 
sites, and any other sequences of nucleotides known to one of skill in the art to act 
directly or indirectly to regulate the amount of transcription from the promoter. A 
"constitutive" promoter is a promoter that is active under most environmental and
developmental conditions. An "inducible" promoter is a promoter that is active under 
environmental or developmental regulation. 

Detailed description of the invention

In a first aspect the present invention relates to a transformed host cell that has
the ability of isomerising xylose to xylulose. The ability of isomerising xylose to
xylulose is conferred to the host cell by transformation of the host cell with a nucleic
acid construct comprising a nucleotide sequence encoding a xylose isomerase. The
transformed host cell's ability to isomerise xylose into xylulose is the direct
isomerisation of xylose to xylulose. This is understood to mean that xylose is isomerised
into xylulose in a single reaction catalysed by a xylose isomerase, as opposed to the
two-step conversion of xylose into xylulose via a xylitol intermediate as catalysed by
xylose reductase and xylitol dehydrogenase, respectively.

The nucleotide sequence encodes a xylose isomerase that is preferably expressed
in active form in the transformed host cell. Thus, expression of the nucleotide sequence
in the host cell produces a xylose isomerase with a specific activity of at least 10 U
xylose isomerase activity per mg protein at 25°C, preferably at least 20, 25, 30, 50, 100,
200 or 300 U per mg at 25°C. The specific activity of the xylose isomerase expressed in
the transformed host cell is herein defined as the amount of xylose isomerase activity
units per mg protein of cell free lysate of the host cell, e.g. a yeast cell free lysate.
Determination of the xylose isomerase activity, amount of protein and preparation of
the cell free lysate are as described in Example 1. Alternatively, the specific activity
may be determined as indicated in Example 4. Accordingly, expression of the
nucleotide sequence in the host cell produces a xylose isomerase with a specific activity
of at least 50 U xylose isomerase activity per mg protein at 30°C, preferably at least
100, 200, 500, or 750 U per mg at 30°C.

Preferably, expression of the nucleotide sequence in the host cell produces a
xylose isomerase with a Km for xylose that is less than 50, 40, 30 or 25 mM; more
preferably, the Km for xylose is about 20 mM or less.
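
To see why a lower Km matters, the following sketch evaluates the standard Michaelis-Menten rate law, v = Vmax [S] / (Km + [S]). That the enzyme follows simple Michaelis-Menten kinetics is an assumption made here for illustration; the function name and the chosen substrate concentration (the 10 mM xylose of the unit assay) are likewise illustrative.

```python
def fraction_of_vmax(substrate_mm, km_mm):
    """Michaelis-Menten rate as a fraction of Vmax: [S] / (Km + [S])."""
    if substrate_mm < 0 or km_mm <= 0:
        raise ValueError("substrate must be >= 0 and Km must be > 0")
    return substrate_mm / (km_mm + substrate_mm)


if __name__ == "__main__":
    # Compare the Km thresholds named in the text at 10 mM xylose
    for km in (50, 40, 30, 25, 20):
        print(km, round(fraction_of_vmax(10, km), 3))
```

At a fixed xylose concentration, each step down in Km raises the fraction of the maximal rate that is reached, which is the practical meaning of the preference stated above.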

A nucleotide sequence encoding the xylose isomerase may be selected from the 
group consisting of: 
(a) nucleotide sequences encoding a polypeptide comprising an amino acid sequence
that has at least 40, 45, 49, 50, 53, 55, 60, 70, 80, 90, 95, 97, 98, or 99% sequence
identity with the amino acid sequence of SEQ ID NO. 1;

(b) nucleotide sequences comprising a nucleotide sequence that has at least 40, 50, 55,
56, 57, 60, 70, 80, 90, 95, 97, 98, or 99% sequence identity with the nucleotide
sequence of SEQ ID NO. 2;

(c) nucleotide sequences the complementary strand of which hybridises to a nucleic
acid molecule sequence of (a) or (b);

(d) nucleotide sequences the sequence of which differs from the sequence of a nucleic
acid molecule of (c) due to the degeneracy of the genetic code.

The nucleotide sequence preferably encodes a eukaryotic xylose isomerase, i.e. a
xylose isomerase with an amino acid sequence that is identical to that of a xylose
isomerase that naturally occurs in a eukaryotic organism. Expression of a eukaryotic
xylose isomerase increases the likelihood that the xylose isomerase is expressed in
active form in a eukaryotic host cell such as yeast, as opposed to the mesophilic
prokaryotic xylose isomerases. More preferably the nucleotide sequence encodes a
plant xylose isomerase (e.g. from Hordeum vulgare) or a fungal xylose isomerase (e.g.
from a Basidiomycete). Most preferably, however, the nucleotide sequence encodes a
xylose isomerase from an anaerobic fungus, to further increase the likelihood of
expression in enzymatically active form in a eukaryotic host cell, particularly in yeast.
Most preferred are nucleotide sequences encoding a xylose isomerase from an
anaerobic fungus that belongs to the families Neocallimastix, Caecomyces, Piromyces,
Orpinomyces, or Ruminomyces.

A host cell for transformation with a nucleotide sequence encoding a xylose
isomerase preferably is a host capable of active or passive xylose transport into the cell.
The host cell preferably contains active glycolysis, the pentose phosphate pathway and
preferably contains xylulose kinase activity so that the xylulose isomerised from xylose
may be metabolised to pyruvate. The host further preferably contains enzymes for
conversion of pyruvate to a desired fermentation product such as ethanol, ethylene or
lactic acid. A preferred host cell is a host cell that is naturally capable of alcoholic
fermentation, preferably anaerobic alcoholic fermentation. The host cell further
preferably has a high tolerance to ethanol and organic acids like lactic acid, acetic acid
or formic acid and sugar degradation products such as furfural and
hydroxymethylfurfural. Any of these characteristics or activities of the host cell may be
naturally present in the host cell or may be introduced or modified by genetic
modification. A suitable host cell is a microorganism like a bacterium or a fungus;
however, most suitable as host cells are yeasts or filamentous fungi.

Yeasts are herein defined as eukaryotic microorganisms and include all species of 
the subdivision Eumycotina (Alexopoulos, C. J., 1962, In: Introductory Mycology, 
John Wiley & Sons, Inc., New York) that predominantly grow in unicellular form. 
Yeasts may either grow by budding of a unicellular thallus or may grow by fission of 
the organism. Preferred yeasts as host cells belong to the genera Saccharomyces, 
Kluyveromyces, Candida, Pichia, Schizosaccharomyces, Hansenula, Kloeckera, 
Schwanniomyces, and Yarrowia. Preferably the yeast is capable of anaerobic 
fermentation, more preferably anaerobic alcoholic fermentation. 

Filamentous fungi are herein defined as eukaryotic microorganisms that include all 
filamentous forms of the subdivision Eumycotina. These fungi are characterized by a 
vegetative mycelium composed of chitin, cellulose, and other complex polysaccharides. 
The filamentous fungi of the present invention are morphologically, physiologically, 
and genetically distinct from yeasts. Vegetative growth by filamentous fungi is by 
hyphal elongation and carbon catabolism of most filamentous fungi is obligately 
aerobic. Preferred filamentous fungi as host cells belong to the genera Aspergillus, 
Trichoderma, Humicola, Acremonium, Fusarium, and Penicillium.

Over the years suggestions have been made for the introduction of various 
organisms for the production of bio-ethanol from crop sugars. In practice, however, all 
major bio-ethanol production processes have continued to use the yeasts of the genus 
Saccharomyces as ethanol producer. This is due to the many attractive features of 
Saccharomyces species for industrial processes, i.e., a high acid-, ethanol- and
osmo-tolerance, capability of anaerobic growth, and of course its high alcoholic
fermentative capacity. Preferred yeast species as host cells include S. cerevisiae,
S. bulderi, S. barnetti, S. exiguus, S. uvarum, S. diastaticus, K. lactis, K. marxianus,
K. fragilis.

The host cell is transformed with a nucleic acid construct as further defined 
below and may comprise a single but preferably comprises multiple copies of the 
nucleic acid construct. The nucleic acid construct may be maintained episomally and 
thus comprise a sequence for autonomous replication, such as an ARS sequence. 
Suitable episomal nucleic acid constructs may e.g. be based on the yeast 2μ or pKD1
(Fleer et al., 1991, Biotechnology 9:968-975) plasmids. Preferably, however, the
nucleic acid construct is integrated in one or more copies into the genome of the host
cell. Integration into the host cell's genome may occur at random by illegitimate
recombination but preferably the nucleic acid construct is integrated into the host cell's
genome by homologous recombination as is well known in the art of fungal molecular 
genetics (see e.g. WO 90/14423, EP-A-0 481 008, EP-A-0 635 574 and US 6,265,186). 

In a preferred transformed host cell according to the invention, the nucleic acid 
construct confers to the host cell the ability to grow on xylose as carbon source, 
preferably as sole carbon source, and preferably under anaerobic conditions, whereby 
preferably the transformed host produces essentially no xylitol, e.g. the xylitol produced
is below the detection limit or e.g. less than 5, 2 or 1% of the carbon consumed on a
molar basis. The transformed host cell has the ability to grow on xylose as sole carbon
source at a rate of at least 0.01, 0.02, 0.05, 0.1 or 0.2 h⁻¹. The transformed host cell of
the invention thus expresses a xylose isomerase at a specific activity level defined 
above. 
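
The growth rates and the xylitol limit above can be put into more familiar terms with two small conversions. The sketch assumes exponential growth (doubling time = ln 2 / rate) and xylose as the only carbon consumed, so that xylitol and xylose each contribute five carbon atoms per molecule; the function names are hypothetical.

```python
import math

CARBONS_PER_XYLOSE = 5
CARBONS_PER_XYLITOL = 5


def doubling_time_h(growth_rate_per_h):
    """Doubling time in hours for a specific growth rate in h^-1."""
    if growth_rate_per_h <= 0:
        raise ValueError("growth rate must be positive")
    return math.log(2) / growth_rate_per_h


def xylitol_carbon_percent(mol_xylitol, mol_xylose_consumed):
    """Xylitol carbon as a percentage of carbon consumed (xylose only)."""
    if mol_xylose_consumed <= 0:
        raise ValueError("xylose consumed must be positive")
    carbon_out = CARBONS_PER_XYLITOL * mol_xylitol
    carbon_in = CARBONS_PER_XYLOSE * mol_xylose_consumed
    return 100.0 * carbon_out / carbon_in


if __name__ == "__main__":
    for mu in (0.01, 0.02, 0.05, 0.1, 0.2):
        print(mu, round(doubling_time_h(mu), 1))
    print(round(xylitol_carbon_percent(0.02, 1.0), 2))
```

Under these assumptions the listed growth rates correspond to doubling times from about 69 hours down to about 3.5 hours.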

A host cell may comprise further genetic modifications that result in one or more
of the characteristics selected from the group consisting of (a) increased transport of
xylose into the host cell; (b) increased xylulose kinase activity; (c) increased flux of the
pentose phosphate pathway; (d) decreased sensitivity to catabolite repression; (e)
increased tolerance to ethanol, osmolality or organic acids; and, (f) reduced production
of by-products. By-products are understood to mean carbon-containing molecules other
than the desired fermentation product and include e.g. xylitol, glycerol and/or acetic
acid. Such genetic modifications may be introduced by classical mutagenesis and
screening and/or selection for the desired mutant. Alternatively, the genetic
modifications may consist of overexpression of endogenous genes and/or expression of
heterologous genes and/or the inactivation of endogenous genes. The genes are
preferably chosen from genes encoding a hexose or pentose transporter; a xylulose
kinase such as the xylulose kinase genes from S. cerevisiae (XKS1; Deng and Ho, 1990,
Appl. Biochem. Biotechnol. 24-25: 193-199) or Piromyces (xylB, i.e. SEQ ID NO. 4);
an enzyme from the pentose phosphate pathway such as a transaldolase (TAL1) or a
transketolase (TKL1) (see e.g. Meinander et al., 1995, Pharmacol. Toxicol. Suppl. 2: 45);
glycolytic enzymes; and ethanologenic enzymes such as alcohol dehydrogenases. Preferred
endogenous genes for inactivation include a hexose kinase gene e.g. the S. cerevisiae
HXK2 gene (see Diderich et al., 2001, Appl. Environ. Microbiol. 67: 1587-1593); the S.
cerevisiae MIG1 or MIG2 genes; (unspecific) aldose reductase genes such as the S.
cerevisiae GRE3 gene (Traff et al., 2001, Appl. Environm. Microbiol. 67: 5668-5674);
genes for enzymes involved in glycerol metabolism such as the S. cerevisiae
glycerol-phosphate dehydrogenase 1 and/or 2 genes; or (hybridising) homologues of the
genes in other host species. Further preferred modifications of host cells for xylose
fermentation are reviewed in Zaldivar et al. (2001, supra).

In another aspect the invention relates to a transformed host cell for the
production of fermentation products other than ethanol. Such non-ethanolic
fermentation products include in principle any bulk or fine chemical that is producible
by a eukaryotic microorganism such as a yeast or a filamentous fungus. Such
fermentation products include e.g. lactic acid, acetic acid, succinic acid, amino acids,
1,3-propane-diol, ethylene, glycerol, β-lactam antibiotics and cephalosporins.

Transformation of host cells with the nucleic acid constructs of the invention and
additional genetic modification of host cells, preferably yeasts, as described above may
be carried out by methods well known in the art. Such methods are e.g. known from
standard handbooks, such as Sambrook and Russell (2001) "Molecular Cloning: A
Laboratory Manual" (3rd edition), Cold Spring Harbor Laboratory, Cold Spring Harbor
Laboratory Press, or F. Ausubel et al., eds., "Current protocols in molecular biology",
Green Publishing and Wiley Interscience, New York (1987). Methods for
transformation and genetic modification of fungal host cells are known from e.g.
EP-A-0 635 574, WO 98/46772, WO 99/60102 and WO 00/37671.

In another aspect the invention relates to a nucleic acid construct comprising a
nucleotide sequence encoding a xylose isomerase as defined above and used for
transformation of a host cell as defined above. In the nucleic acid construct, the
nucleotide sequence encoding the xylose isomerase preferably is operably linked to a
promoter for control and initiation of transcription of the nucleotide sequence in a host
cell as defined below. The promoter preferably is capable of causing sufficient
expression of the xylose isomerase in the host cell, to confer to the host cell the ability
to isomerise xylose into xylulose. Preferably, the promoter causes a specific xylose
isomerase activity in the host cell as defined above. Promoters useful in the nucleic acid
constructs of the invention include both constitutive and inducible natural promoters as
well as engineered promoters. A preferred promoter for use in the present invention
will in addition be insensitive to catabolite (glucose) repression and/or will preferably
not require xylose for induction. Promoters having these characteristics are widely
available and known to the skilled person. Suitable examples of such promoters include
e.g. yeast promoters from glycolytic genes, such as the yeast phosphofructokinase
(PFK), triose phosphate isomerase (TPI), glyceraldehyde-3-phosphate dehydrogenase
(GPD, TDH3 or GAPDH), pyruvate kinase (PYK), phosphoglycerate kinase (PGK)
promoters; more details about such promoters may be found in WO 93/03159. Other
useful promoters are ribosomal protein encoding gene promoters, the lactase gene 
promoter (LAC4), alcohol dehydrogenase promoters (ADH1, ADH4, and the like), and 
the enolase promoter (ENO). Other promoters, both constitutive and inducible and 
enhancers or upstream activating sequences will be known to those of skill in the art. 
The promoters used in the nucleic acid constructs of the present invention may be 
modified, if desired, to affect their control characteristics. Preferably, the promoter used 
in the nucleic acid construct for expression of the xylose isomerase is homologous to 
the host cell in which the xylose isomerase is expressed. 

In the nucleic acid construct, the 3'-end of the nucleic acid sequence encoding
the xylose isomerase preferably is operably linked to a transcription terminator
sequence. Preferably the terminator sequence is operable in a host cell of choice, such
as e.g. the yeast species of choice. In any case the choice of the terminator is not
critical; it may e.g. be from any yeast gene, although terminators may sometimes work
if from a non-yeast, eukaryotic gene. The transcription termination sequence further
preferably comprises a polyadenylation signal. 

Optionally, a selectable marker may be present in the nucleic acid construct. As
used herein, the term "marker" refers to a gene encoding a trait or a phenotype which
permits the selection of, or the screening for, a host cell containing the marker. The
marker gene may be an antibiotic resistance gene whereby the appropriate antibiotic
can be used to select for transformed cells from among cells that are not transformed.
Examples of suitable antibiotic resistance markers include e.g. dihydrofolate reductase,
hygromycin-B-phosphotransferase and 3'-O-phosphotransferase II (kanamycin, neomycin
and G418 resistance). Although the use of antibiotic resistance markers may be most
convenient for the transformation of polyploid host cells, preferably however,
non-antibiotic resistance markers are used, such as auxotrophic markers (URA3, TRP1,
LEU2) or the S. pombe TPI gene (described by Russell P R, 1985, Gene 40: 125-130).
In a preferred embodiment the host cells transformed with the nucleic acid constructs
are marker gene free. Methods for constructing recombinant marker gene free
microbial host cells are disclosed in EP-A-0 635 574 and are based on the use of
bidirectional markers such as the A. nidulans amdS (acetamidase) gene or the yeast
URA3 and LYS2 genes. Alternatively, a screenable marker such as Green Fluorescent
Protein, lacZ, luciferase, chloramphenicol acetyltransferase or beta-glucuronidase may be
incorporated into the nucleic acid constructs of the invention, allowing screening for
transformed cells.

Optional further elements that may be present in the nucleic acid constructs of the 
invention include, but are not limited to, one or more leader sequences, enhancers, 
integration factors, and/or reporter genes, intron sequences, centromeres, telomeres and/or 
matrix attachment (MAR) sequences. The nucleic acid constructs of the invention may 
further comprise a sequence for autonomous replication, such as an ARS sequence. 
Suitable episomal nucleic acid constructs may e.g. be based on the yeast 2μ or pKD1 
(Fleer et al., 1991, Biotechnology 9:968-975) plasmids. Alternatively the nucleic acid 
construct may comprise sequences for integration, preferably by homologous 
recombination. Such sequences may thus be sequences homologous to the target site 
for integration in the host cell's genome. The nucleic acid constructs of the invention 
can be provided in a manner known per se, which generally involves techniques such as 
restricting and linking nucleic acids/nucleic acid sequences, for which reference is 
made to the standard handbooks, such as Sambrook and Russell (2001) "Molecular 
Cloning: A Laboratory Manual" (3rd edition), Cold Spring Harbor Laboratory, Cold 
Spring Harbor Laboratory Press, or F. Ausubel et al., eds., "Current protocols in 
molecular biology", Greene Publishing and Wiley Interscience, New York (1987). 

In another aspect the invention relates to a nucleic acid molecule comprising a 
nucleotide sequence that encodes a xylose isomerase. The nucleic acid molecule is 
preferably selected from the group consisting of: 

(a) nucleic acid molecules encoding a polypeptide comprising an amino acid sequence 
that has at least 50, 53, 54, 55, 60, 70, 80, 90, 95, 97, 98, or 99% sequence identity 
with the amino acid sequence of SEQ ID NO. 1; 

(b) nucleic acid molecules comprising a nucleotide sequence that has at least 50, 56, 
57, 58, 60, 70, 80, 90, 95, 97, 98, or 99% sequence identity with the nucleotide 
sequence of SEQ ID NO. 2; 

(c) nucleic acid molecules the complementary strand of which hybridises to a nucleic 
acid molecule sequence of (a) or (b); 

(d) nucleic acid molecules the sequence of which differs from the sequence of a nucleic 
acid molecule of (c) due to the degeneracy of the genetic code. 

Alternatively, a nucleic acid molecule of (a) may encode a polypeptide 
comprising an amino acid sequence that has at least 67, 68, 69, 70, 80, 90, 95, 97, 98, 
or 99% sequence similarity with the amino acid sequence of SEQ ID NO. 1. A nucleic 
acid molecule of (c) preferably hybridises under moderate conditions, more preferably 
under stringent conditions as herein defined above. Preferably the nucleic acid 
molecule is from a eukaryote, more preferably from a eukaryotic microorganism such 
as a fungus, most preferably from an anaerobic fungus, such as e.g. the anaerobic fungi 
described above. 
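
The identity thresholds in (a) and (b) are percentages computed over an alignment of two sequences. The following is a minimal sketch of that arithmetic only, assuming the two sequences have already been aligned to equal length with "-" marking gaps. The function name, the toy fragments and the choice to count identical columns over all aligned columns are illustrative assumptions; the alignment algorithm and parameters defined earlier in this description govern the actual comparison.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the columns of two pre-aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Drop columns in which both sequences carry a gap.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("no aligned columns")
    # A column counts as identical only if both residues match and are not gaps.
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


# Hypothetical toy fragments, not taken from the sequence listing.
print(percent_identity("MAKEYFPQ", "MAKDYF-Q"))  # 6 of 8 columns identical -> 75.0
```

A candidate sequence would then be compared against a chosen threshold from the list, for example `percent_identity(...) >= 70`.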

Yet another aspect of the invention relates to a nucleic acid molecule comprising 
a nucleotide sequence that encodes a xylulose kinase, preferably a D-xylulose kinase. A 
D-xylulose kinase (EC 2.7.1.17; also referred to as a D-xylulokinase) is herein defined 
as an enzyme that catalyses the conversion of D-xylulose into xylulose-5-phosphate. 
The nucleic acid molecule is preferably selected from the group consisting of: 

(a) nucleic acid molecules encoding a polypeptide comprising an amino acid sequence 
that has at least 45, 47, 48, 49, 50, 55, 60, 70, 80, 90, 95, 97, 98, or 99% sequence 
identity with the amino acid sequence of SEQ ID NO. 3; 

(b) nucleic acid molecules comprising a nucleotide sequence that has at least 30, 37, 
38, 39, 40, 50, 60, 70, 80, 90, 95, 97, 98, or 99% sequence identity with the 
nucleotide sequence of SEQ ID NO. 4; 

(c) nucleic acid molecules the complementary strand of which hybridises to a nucleic 
acid molecule sequence of (a) or (b); and, 

(d) nucleic acid molecules the sequence of which differs from the sequence of a nucleic 
acid molecule of (c) due to the degeneracy of the genetic code. 

Alternatively, a nucleic acid molecule of (a) may encode a polypeptide 
comprising an amino acid sequence that has at least 64, 65, 66, 70, 80, 90, 95, 97, 98, 
or 99% sequence similarity with the amino acid sequence of SEQ ID NO. 3. A nucleic 
acid molecule of (c) preferably hybridises under moderate conditions, more preferably 
under stringent conditions as herein defined above. Preferably the nucleic acid 
molecule is from a eukaryote, more preferably from a eukaryotic microorganism such 
as a fungus, most preferably from an anaerobic fungus, such as e.g. the anaerobic fungi 
described above. 

In a further aspect the invention relates to fermentation processes in which the 
transformed host cells of the invention are used for the fermentation of a carbon source 
comprising a source of xylose, such as xylose. In addition to a source of xylose the 
carbon source in the fermentation medium may also comprise a source of glucose. The 
source of xylose or glucose may be xylose or glucose as such or may be any 
carbohydrate oligo- or polymer comprising xylose or glucose units, such as e.g. 
lignocellulose, xylans, cellulose, starch and the like. For release of xylose or glucose 
units from such carbohydrates, appropriate carbohydrases (such as xylanases, 
glucanases, amylases and the like) may be added to the fermentation medium or may be 
produced by the transformed host cell. In the latter case the transformed host cell may 
be genetically engineered to produce and excrete such carbohydrases. In a preferred 
process the transformed host cell ferments both the xylose and glucose, preferably 
simultaneously, in which case preferably a transformed host cell is used which is 
insensitive to glucose repression to prevent diauxic growth. In addition to a source of 
xylose (and glucose) as carbon source, the fermentation medium will further comprise 
the appropriate ingredients required for growth of the transformed host cell. 
Compositions of fermentation media for growth of microorganisms such as yeasts are 
well known in the art. 

The fermentation process is a process for the production of a fermentation 
product such as ethanol, lactic acid, acetic acid, succinic acid, amino acids, 
1,3-propane-diol, ethylene, glycerol, β-lactam antibiotics such as Penicillin G or Penicillin 
V and fermentative derivatives thereof and cephalosporins. The fermentation process 
may be an aerobic or an anaerobic fermentation process. An anaerobic fermentation 
process is herein defined as a fermentation process run in the absence of oxygen or in 
which substantially no oxygen is consumed, e.g. less than 5 mmol/L/h, and wherein 
organic molecules serve as both electron donor and electron acceptor. In the absence 
of oxygen, NADH produced in glycolysis and biomass formation cannot be oxidised 
by oxidative phosphorylation. To solve this problem many microorganisms use 
pyruvate or one of its derivatives as an electron and hydrogen acceptor, thereby 
regenerating NAD+. Thus, in a preferred anaerobic fermentation process pyruvate is 



WO 03/062430 



PCT/NL03/00049 



16 

used as an electron (and hydrogen) acceptor and is reduced to fermentation products 
such as ethanol, lactic acid, 1,3-propanediol, ethylene, acetic acid or succinic acid. 

The fermentation process is preferably run at a temperature that is optimal for the 
transformed host cell. Thus, for most yeasts or fungal host cells, the fermentation 
process is performed at a temperature which is less than 38°C. For yeast or filamentous 
fungal host cells, the fermentation process is preferably performed at a temperature 
which is lower than 35, 33, 30 or 28°C and at a temperature which is higher than 20, 22, 
or 25°C. 

A preferred process is a process for the production of ethanol, whereby the 
process comprises the steps of: (a) fermenting a medium containing a source of xylose 
with a transformed host cell as defined above, whereby the host cell ferments xylose to 
ethanol; and optionally, (b) recovery of the ethanol. The fermentation medium may also 
comprise a source of glucose that is also fermented to ethanol. In the process the 
volumetric ethanol productivity is preferably at least 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 5.0 or 
10.0 g ethanol per litre per hour. The ethanol yield on xylose and/or glucose in the 
process preferably is at least 50, 60, 70, 90, 95 or 98%. The ethanol yield is herein 
defined as a percentage of the theoretical yield, which, for glucose and xylose, is 0.51 g 
ethanol per g glucose or xylose. 
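
The two performance figures defined above, ethanol yield as a percentage of the theoretical 0.51 g/g and volumetric productivity in g ethanol per litre per hour, can be illustrated with a short calculation. This is a sketch only: the function names and the input numbers are hypothetical and are not measurements reported in this document.

```python
THEORETICAL_YIELD_G_PER_G = 0.51  # g ethanol per g glucose or xylose, as stated in the text


def ethanol_yield_percent(ethanol_g: float, sugar_consumed_g: float) -> float:
    """Ethanol yield expressed as a percentage of the theoretical yield."""
    observed = ethanol_g / sugar_consumed_g  # g ethanol per g sugar
    return 100.0 * observed / THEORETICAL_YIELD_G_PER_G


def volumetric_productivity(ethanol_g_per_l: float, hours: float) -> float:
    """Volumetric productivity in g ethanol per litre per hour."""
    return ethanol_g_per_l / hours


# Hypothetical run: 40.8 g ethanol from 100 g sugar, reached in 24 h in 1 L.
print(round(ethanol_yield_percent(40.8, 100.0), 1))   # about 80 % of theoretical
print(round(volumetric_productivity(40.8, 24.0), 2))  # about 1.7 g/L/h
```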

In a further aspect the invention relates to a process for producing a fermentation 
product selected from the group consisting of lactic acid, acetic acid, succinic acid, 
amino acids, 1,3-propane-diol, ethylene, glycerol, β-lactam antibiotics and 
cephalosporins. The process preferably comprises the steps of (a) fermenting a medium 
containing a source of xylose with a transformed host cell as defined herein above, 
whereby the host cell ferments xylose to the fermentation product, and optionally, (b) 
recovery of the fermentation product. In a preferred process, the medium also contains 
a source of glucose. 

Description of the figures 
Figure 1. Growth curves of S. cerevisiae transformants grown on medium containing 25 
mM galactose and 100 mM xylose as carbon source. Transformant pYes contains a 
yeast expression vector without insertion. Transformants 14.3, 16.2.1 and 16.2.2 are 
transformed with the pYES vector containing the Piromyces sp. E2 xylose isomerase 
coding sequence. 
Examples 
Example 1 

Cloning of Piromyces xylose isomerase and xylulose kinase cDNAs 
Organism and growth conditions 

The anaerobic fungus Piromyces sp. E2 (ATCC 76762), isolated from faeces of 
an Indian elephant, was grown anaerobically under N2/CO2 (80%/20%) at 39°C in 
medium M2 supplemented with various carbon sources (24). Carbon sources used were 
Avicel (microcrystalline cellulose type PH 105, Serva, Germany), fructose or xylose (all 
0.5%, w/v). After growth ceased, as judged by hydrogen production, the cells were 
harvested by centrifugation (15,000 x g, 4°C, 15 min) or by filtration over nylon gauze 
(30 μm pore size). 
Preparation of cell-free extract 

The fungal cells were washed with deionized water to remove medium 
components. Cell-free extracts were prepared by freezing the cells in liquid nitrogen 
and subsequent grinding with glass beads (0.10-0.11 mm diameter) in a mortar. 
Tris/HCl buffer (100 mM, pH 7.0) was added to the powder (1:1, w/v) and after 
thawing for 15 min the suspension was centrifuged (18,000 x g, 4°C, 15 min). The 
clear supernatant was used as a source of intracellular enzymes. 
Enzyme assays 

Xylose isomerase activity was assayed at 37°C in a reaction mixture containing 
50 mM phosphate buffer (pH 7.0), 10 mM xylose, 10 mM MgCl2 and a suitable amount 
of cell-free extract. The amount of xylulose formed was determined by the 
cysteine-carbazole method (9). Xylulose kinase and xylose reductase activities were assayed as 
described by Witteveen et al. (28). One unit of activity is defined as the amount of 
enzyme producing 1 nmol of xylulose per min under the assay conditions. Xylulose 
formed was determined by the method of Dische and Borenfreund (Dische and 
Borenfreund, 1951, J. Biol. Chem. 192: 583-587) or by HPLC using a Biorad 
HPX-87N column operated at 80°C and eluted at 0.6 ml/min using 0.01 M Na2HPO4 as the 
eluent. Xylose and xylulose were detected by a Refractive Index detector at an internal 
temperature of 60°C. 




Specific activity is expressed as units per mg protein. Protein was determined 
with the Bio-Rad protein reagent (Bio-Rad Laboratories, Richmond, CA, USA) with 
bovine γ-globulin as a standard. 
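
As an illustration of the unit definition above (one unit produces 1 nmol of xylulose per min, and specific activity is units per mg protein), the calculation can be sketched as follows. The function name and the input numbers are hypothetical; they are chosen only to show the arithmetic and are not assay data from this document.

```python
def specific_activity(xylulose_nmol: float, minutes: float, protein_mg: float) -> float:
    """Specific activity in U per mg protein, with 1 U = 1 nmol xylulose per min."""
    units = xylulose_nmol / minutes  # enzyme units present in the assay
    return units / protein_mg


# Hypothetical assay: 300 nmol xylulose formed in 15 min by extract holding 2 mg protein.
print(specific_activity(300.0, 15.0, 2.0))  # 20 U in the assay / 2 mg = 10.0 U per mg
```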

Random sequencing of a Piromyces sp. E2 cDNA library 

The cDNA library constructed in the vector lambda ZAPII as described 
previously (2) was used. An aliquot of this library was converted to pBluescript SK- 
clones by mass excision with the ExAssist helper phage (Stratagene, La Jolla, CA, 
USA). Randomly selected clones were sequenced with the M13 reverse primer to 
obtain 5' part sequences. Incomplete cDNAs were used to synthesize probes which 
were used to rescreen the library. To obtain full length sequences subclones were 
generated in pUC18. Sequencing was performed with the ABI Prism 310 automated 
sequencer with the dRhodamine terminator cycle sequencing ready reaction DNA 
sequencing kit (Perkin-Elmer Applied Biosystems). 
Results 

Randomly selected clones from a cDNA library of the anaerobic fungus 
Piromyces sp. E2 were sequenced and this resulted in two clones (pH97 and pAK44) 
whose sequences showed high homology to xylose isomerase and D-xylulokinase 
genes, respectively. The clones were analysed in detail. 

Clone pH97 did not contain a complete ORF and therefore the cDNA library was 
rescreened with a probe designed on the basis of sequence data from clone pH97. This 
resulted in a clone pR3 with an insert of 1669 bp. An ORF encoding a protein of 437 
amino acids with high similarity to xylose isomerases could be identified. Although the 
5' untranslated region comprises only 4 bp, the presumed starting methionine residue 
fitted well into an alignment of known xylose isomerase sequences. The 3' untranslated 
region was 351 bp long and had a high AT content, which is typical for anaerobic 
fungi. The ORF contained the amino acids shown to be important for interaction with 
the substrate (catalytic triad His 102, Asp 105, Asp 340 and Lys 235) and binding of 
magnesium (Glu 232) (14, 26). Further, the two signature patterns (residues 185-194 
and 230-237) developed for xylose isomerases (20) were present. The Piromyces sp. E2 
xylose isomerase (XylA) shows the highest homology to the enzymes of Haemophilus 
influenzae (52% identity, 68% similarity) and Hordeum vulgare (49% identity, 67% 
similarity). The polypeptide deduced from the cDNA sequence corresponds to a 
molecular mass of 49,395 Da and has a calculated pI of 5.2. 
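
The statement that the ORF starts after a 5' untranslated region of only 4 bp can be checked by translating the first 60 nucleotides of SEQ ID NO. 2 from the fifth base onward. The sketch below assumes the standard genetic code; the helper names are illustrative and the code is not part of the original methods.

```python
# Standard genetic code, built in the conventional t, c, a, g codon order.
BASES = "tcag"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna: str, start: int = 0) -> str:
    """Translate DNA from `start` (0-based) until a stop codon or the last full codon."""
    dna = dna.lower().replace(" ", "")
    protein = []
    for pos in range(start, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[pos:pos + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


# First 60 nucleotides of SEQ ID NO. 2 as given in the sequence listing.
SEQ2_START = "gtaaatggct aaggaatatt tcccacaaat tcaaaagatt aagttcgaag gtaaggattc"

# Skip the 4-bp 5' untranslated region and read from the ATG.
print(translate(SEQ2_START, start=4))  # MAKEYFPQIQKIKFEGKD
```

The one-letter output corresponds to the first eighteen residues of SEQ ID NO. 1 (Met Ala Lys Glu Tyr Phe Pro Gln Ile Gln Lys Ile Lys Phe Glu Gly Lys Asp).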

The second clone, pAK44, had an insert of 2041 bp and contained a complete 
ORF encoding a protein of 494 amino acids with a molecular weight of 53,158 Da and 
a pI of 5.0. The first methionine is preceded by a 111 bp 5' untranslated region, while 
the 3' untranslated region comprised 445 bp. Both regions are AT-rich. BLAST and 
FASTA searches revealed high similarity to xylulokinases. The two phosphate 
consensus regions defined by Rodriguez-Pena et al. (22) were found at positions 6-23 
and 254-270 as shown in a partial alignment. Moreover the signatures for this family of 
carbohydrate kinases as described in the Prosite database were identified (131-145 and 
351-372). The Piromyces sp. E2 xylulokinase (XylB) showed highest homology with 
the XylB protein of Haemophilus influenzae (46% identity, 64% similarity). 
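
The insert sizes reported for the two clones are consistent with the stated untranslated regions and ORF lengths. The short check below assumes that each ORF ends in a single 3-bp stop codon and that the 3' untranslated region is counted from the first base after that stop codon; the function name is illustrative.

```python
def cdna_length(utr5_bp: int, orf_codons: int, utr3_bp: int) -> int:
    """Total insert length: 5' UTR + coding codons + one stop codon + 3' UTR."""
    stop_codon_bp = 3
    return utr5_bp + 3 * orf_codons + stop_codon_bp + utr3_bp


# pR3 (xylose isomerase): 4 bp 5' UTR, 437 codons, 351 bp 3' UTR.
print(cdna_length(4, 437, 351))    # 1669
# pAK44 (xylulokinase): 111 bp 5' UTR, 494 codons, 445 bp 3' UTR.
print(cdna_length(111, 494, 445))  # 2041
```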

Example 2 
Construction of yeast expression vectors 
Expression of xylose isomerase from Piromyces sp. E2 in Saccharomyces cerevisiae 

cDNA from Piromyces sp. E2 was used in a PCR reaction with Pfu polymerase 
(Stratagene). The primers were designed using the sequences from the 5' and 3' ends of 
the xylose isomerase gene and also contained an SfiI and an XbaI restriction site. The 
PCR product was cloned in the pPICZα vector (Invitrogen, Carlsbad, CA, USA). To 
obtain the xylose isomerase gene, the pPICZα vector was digested with EcoRI and 
XbaI. The digestion product was ligated into the pYes2 vector (Invitrogen). The pYes2 
plasmid with the xylose isomerase gene was transformed into Saccharomyces 
cerevisiae (strain BJ1991, gift from Beth Jones, UvA). The genotype of this strain is: 
matα, leu2, trp1, ura3-251, prb1-1122 and pep4-3. 

Transformants were plated on SC plates (0.67% YNB medium + 0.05% L-Leu + 0.05% 
L-Trp + 2% glucose + 2% agarose). Untransformed cells cannot grow on these plates. 
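
The selection logic behind these plates can be sketched as a simple complementation check. The sketch assumes, as is usual for the commercial pYES2 vector but not stated in the text above, that the plasmid carries a URA3 marker; the function, the dictionary and the nutrient names are hypothetical and serve only to illustrate why the untransformed host fails to grow on medium supplying leucine and tryptophan but no uracil.

```python
# Nutrient required by each auxotrophic mutation in the host genotype.
REQUIREMENT = {"leu2": "leucine", "trp1": "tryptophan", "ura3": "uracil"}


def can_grow(auxotrophies, supplements, plasmid_markers=()):
    """True if every auxotrophy is met by the medium or complemented by a plasmid marker."""
    complemented = {marker.lower() for marker in plasmid_markers}
    return all(
        REQUIREMENT[mutation] in supplements or mutation in complemented
        for mutation in auxotrophies
    )


host = {"leu2", "trp1", "ura3"}       # auxotrophies of strain BJ1991
sc_plate = {"leucine", "tryptophan"}  # supplements in the SC plates; no uracil

print(can_grow(host, sc_plate))            # False: untransformed host lacks uracil
print(can_grow(host, sc_plate, {"URA3"}))  # True: assumed plasmid marker complements ura3
```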
Induction 

Transformed Saccharomyces cerevisiae cells were grown on glucose medium at 25°C 
for 72 h (raffinose can be used as an alternative for glucose). Cells were harvested and 
resuspended in SC medium with galactose instead of glucose. After 8 h of induction 
cells were harvested and lysed using glass beads (0.10-0.11 mm diameter) and 
"breaking buffer" (50 mM phosphate buffer + 5% glycerol + protease inhibitor). After 
lysis the mixture was centrifuged (18,000 x g, 4°C, 15 min). The clear supernatant was 
used to determine xylose isomerase activity using the method described above 
(Example 1). An activity of 10 U per mg protein was measured at 37°C. 


Example 3 

Growth of transformed yeast strains on xylose 
Medium composition 

Saccharomyces cerevisiae strains were grown on SC-medium with the following 
composition: 0.67% (w/v) yeast nitrogen base; 0.01% (w/v) L-tryptophan; 0.01% (w/v) 
L-leucine and either glucose, galactose or xylose, or a combination of these substrates 
(see below). For agar plates the medium was supplemented with 2% (w/v) 
bacteriological agar. 
Growth experiment 

Saccharomyces cerevisiae strain BJ1991 (genotype: matα, leu2, trp1, ura3-251, 
prb1-1122, pep4-3) transformed with pYes2 without insertion and three selected 
transformants (16.2.1; 16.2.2 and 14.3) containing pYes2 with the Piromyces sp. E2 
xylose isomerase gene were grown on SC-agar plates with 10 mM glucose as carbon 
source. When colonies were visible, single colonies were used to inoculate liquid 
SC-medium with 100 mM xylose and 25 mM galactose as carbon sources. Growth was 
monitored by measuring the increase in optical density at 600 nm on an LKB Ultrospec 
K spectrophotometer. 
Results 

The results of the growth experiments are compiled in Figure 1. The culture with the 
BJ1991 strain transformed with pYes2 without insertion shows an increase in OD600 up 
to 80 h. After this time a gradual decrease is observed. This is caused by aggregation of 
the yeast cells which is often observed at the end of growth. The cultures with the three 
transformants do not stop growing after 80 h and show a further increase up to at least 
150 h. 
Example 4 

Construction of a new, improved, yeast expression vector for constitutive expression of 
the Piromyces sp. E2 xylose isomerase in Saccharomyces cerevisiae. 
The pPICZα vector, containing the Piromyces sp. E2 gene coding for xylose isomerase, 
was used as a template for PCR with VentR DNA polymerase (New England Biolabs). 
The primers were designed using the 5' and 3' sequences of the gene coding for xylose 
isomerase and included an EcoRI and an SpeI site. Additionally the primers were 
designed to remove the XbaI site found in the pPICZα construct, replacing it with a 
stop codon (TAA). The final product was designed to restore the original open reading 
frame, without the added amino acids (His and c-Myc tags) found in the pPICZα 
construct. The PCR product was cut with EcoRI and SpeI. The final product was 
cloned into a vector derived from pYES2 (Invitrogen). In this vector the GAL1 
promoter found in pYES2 was replaced by the TPI1 promoter in order to ensure 
constitutive expression of the xylose isomerase, thereby eliminating the need for 
galactose in the medium. The TPI1 promoter was cloned from a modified form of 
plasmid pYX012 (R&D systems). The promoter was cut out as an NheI-EcoRI fragment. 
Both the TPI1 promoter and the PCR product of the gene coding for the xylose 
isomerase were ligated into pYES2 cut with SpeI and XbaI. This plasmid was used to 
transform Saccharomyces cerevisiae strain CEN.PK113-5D (gift from Peter Kötter, 
Frankfurt). The genotype of the strain is: MatA ura3-52. Transformants were selected 
on mineral medium plates (Verduyn et al.: Effect of benzoic acid on metabolic fluxes in 
yeasts: a continuous-culture study on the regulation of respiration and alcoholic 
fermentation. (1992) Yeast 8(7):501-17) with 2% glucose as the carbon source. 
Untransformed cells cannot grow on these plates. 
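
The cloning scheme above relies on compatible cohesive ends: an NheI-EcoRI promoter fragment and an EcoRI-SpeI gene fragment are joined into a vector opened with SpeI and XbaI. The sketch below uses the well-known recognition sequences and cut positions of these four enzymes to show which ends can anneal; the data structure and function names are illustrative assumptions, not part of the original protocol.

```python
# Recognition site (5'->3') and cut position on the top strand for each enzyme.
# All four cut after the first base of a palindromic 6-bp site, leaving a 4-nt 5' overhang.
SITES = {
    "EcoRI": ("GAATTC", 1),  # G^AATTC
    "XbaI": ("TCTAGA", 1),   # T^CTAGA
    "SpeI": ("ACTAGT", 1),   # A^CTAGT
    "NheI": ("GCTAGC", 1),   # G^CTAGC
}


def overhang(enzyme: str) -> str:
    """Single-stranded 5' overhang left by the enzyme (valid for the symmetric sites above)."""
    site, cut = SITES[enzyme]
    return site[cut:len(site) - cut]


def compatible(enzyme_a: str, enzyme_b: str) -> bool:
    """Two ends can anneal if their overhangs are identical palindromes."""
    return overhang(enzyme_a) == overhang(enzyme_b)


print(overhang("NheI"), overhang("SpeI"), overhang("XbaI"))  # CTAG CTAG CTAG
print(compatible("NheI", "XbaI"))   # True: promoter end can join a vector end
print(compatible("EcoRI", "SpeI"))  # False: the EcoRI end only joins another EcoRI end
```

Under these assumptions the two fragments meet at their EcoRI ends, while the NheI and SpeI ends pair with the CTAG overhangs of the SpeI/XbaI-cut vector.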

Transformants were grown on glucose/xylose mixtures in carbon-limited chemostat 
cultures. Transformants grown under these conditions exhibit high xylose isomerase 
activities (800 units per mg at 30°C) according to a specific enzyme assay as developed 
by Kersters-Hildersson et al. (Kinetic characterization of D-xylose isomerases by 
enzymatic assays using D-sorbitol dehydrogenase. Enz. Microb. Technol. 9 (1987) 
145-148). The in vitro activity of xylose isomerase in the cell-free extracts of the 
transformed S. cerevisiae strain was dependent on bivalent cations (Mg2+ or Co2+) and 
a relatively low Km value for xylose of approximately 20 mM was measured. 
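
To give a feel for what a Km of approximately 20 mM implies, the fraction of the maximal rate reached at a given xylose concentration can be estimated. This sketch assumes simple Michaelis-Menten kinetics, which the text does not state explicitly; the function name and the example concentrations are illustrative.

```python
def fraction_of_vmax(xylose_mM: float, km_mM: float = 20.0) -> float:
    """Michaelis-Menten rate as a fraction of Vmax: v / Vmax = S / (Km + S)."""
    return xylose_mM / (km_mM + xylose_mM)


# At a substrate concentration equal to Km the enzyme runs at half its maximal rate.
print(fraction_of_vmax(20.0))             # 0.5
# At 100 mM xylose, the concentration used in the growth experiments of Example 3.
print(round(fraction_of_vmax(100.0), 3))  # 0.833
```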
SEQUENCE LISTING 

<110> Royal Nedalco B.V. 

<120> Fermentation of pentose sugars 

<130> piromyces xylose isomerase 

<140> BO 44829 
<141> 2001-12-31 

<160> 4 

<170> PatentIn Ver. 2.1 

<210> 1 
<211> 437 
<212> PRT 
<213> Piromyces sp. 

<400> 1 

Met Ala Lys Glu Tyr Phe Pro Gln Ile Gln Lys Ile Lys Phe Glu Gly
  1               5                  10                  15

Lys Asp Ser Lys Asn Pro Leu Ala Phe His Tyr Tyr Asp Ala Glu Lys
             20                  25                  30

Glu Val Met Gly Lys Lys Met Lys Asp Trp Leu Arg Phe Ala Met Ala
         35                  40                  45

Trp Trp His Thr Leu Cys Ala Glu Gly Ala Asp Gln Phe Gly Gly Gly
     50                  55                  60

Thr Lys Ser Phe Pro Trp Asn Glu Gly Thr Asp Ala Ile Glu Ile Ala
 65                  70                  75                  80

Lys Gln Lys Val Asp Ala Gly Phe Glu Ile Met Gln Lys Leu Gly Ile
                 85                  90                  95

Pro Tyr Tyr Cys Phe His Asp Val Asp Leu Val Ser Glu Gly Asn Ser
            100                 105                 110

Ile Glu Glu Tyr Glu Ser Asn Leu Lys Ala Val Val Ala Tyr Leu Lys
        115                 120                 125

Glu Lys Gln Lys Glu Thr Gly Ile Lys Leu Leu Trp Ser Thr Ala Asn
    130                 135                 140

Val Phe Gly His Lys Arg Tyr Met Asn Gly Ala Ser Thr Asn Pro Asp
145                 150                 155                 160

Phe Asp Val Val Ala Arg Ala Ile Val Gln Ile Lys Asn Ala Ile Asp
                165                 170                 175

Ala Gly Ile Glu Leu Gly Ala Glu Asn Tyr Val Phe Trp Gly Gly Arg
            180                 185                 190

Glu Gly Tyr Met Ser Leu Leu Asn Thr Asp Gln Lys Arg Glu Lys Glu
        195                 200                 205

His Met Ala Thr Met Leu Thr Met Ala Arg Asp Tyr Ala Arg Ser Lys
    210                 215                 220

Gly Phe Lys Gly Thr Phe Leu Ile Glu Pro Lys Pro Met Glu Pro Thr
225                 230                 235                 240

Lys His Gln Tyr Asp Val Asp Thr Glu Thr Ala Ile Gly Phe Leu Lys
                245                 250                 255

Ala His Asn Leu Asp Lys Asp Phe Lys Val Asn Ile Glu Val Asn His
            260                 265                 270

Ala Thr Leu Ala Gly His Thr Phe Glu His Glu Leu Ala Cys Ala Val
        275                 280                 285

Asp Ala Gly Met Leu Gly Ser Ile Asp Ala Asn Arg Gly Asp Tyr Gln
    290                 295                 300

Asn Gly Trp Asp Thr Asp Gln Phe Pro Ile Asp Gln Tyr Glu Leu Val
305                 310                 315                 320

Gln Ala Trp Met Glu Ile Ile Arg Gly Gly Gly Phe Val Thr Gly Gly
                325                 330                 335

Thr Asn Phe Asp Ala Lys Thr Arg Arg Asn Ser Thr Asp Leu Glu Asp
            340                 345                 350

Ile Ile Ile Ala His Val Ser Gly Met Asp Ala Met Ala Arg Ala Leu
        355                 360                 365

Glu Asn Ala Ala Lys Leu Leu Gln Glu Ser Pro Tyr Thr Lys Met Lys
    370                 375                 380

Lys Glu Arg Tyr Ala Ser Phe Asp Ser Gly Ile Gly Lys Asp Phe Glu
385                 390                 395                 400

Asp Gly Lys Leu Thr Leu Glu Gln Val Tyr Glu Tyr Gly Lys Lys Asn
                405                 410                 415

Gly Glu Pro Lys Gln Thr Ser Gly Lys Gln Glu Leu Tyr Glu Ala Ile
            420                 425                 430

Val Ala Met Tyr Gln
        435



<210> 2 
<211> 1669 
<212> DNA 
<213> Piromyces sp. 

<400> 2 

gtaaatggct aaggaatatt tcccacaaat tcaaaagatt aagttcgaag gtaaggattc 60 
taagaatcca ttagccttcc actactacga tgctgaaaag gaagtcatgg gtaagaaaat 120 
gaaggattgg ttacgtttcg ccatggcctg gtggcacact ctttgcgccg aaggtgctga 180 
ccaattcggt ggaggtacaa agtctttccc atggaacgaa ggtactgatg ctattgaaat 240 
tgccaagcaa aaggttgatg ctggtttcga aatcatgcaa aagcttggta ttccatacta 300 
ctgtttccac gatgttgatc ttgtttccga aggtaactct attgaagaat acgaatccaa 360 
ccttaaggct gtcgttgctt acctcaagga aaagcaaaag gaaaccggta ttaagcttct 420 
ctggagtact gctaacgtct tcggtcacaa gcgttacatg aacggtgcct ccactaaccc 480 
agactttgat gttgtcgccc gtgctattgt tcaaattaag aacgccatag acgccggtat 540 
tgaacttggt gctgaaaact acgtcttctg gggtggtcgt gaaggttaca tgagtctcct 600 
taacactgac caaaagcgtg aaaaggaaca catggccact atgcttacca tggctcgtga 660 
ctacgctcgt tccaagggat tcaagggtac tttcctcatt gaaccaaagc caatggaacc 720 
aaccaagcac caatacgatg ttgacactga aaccgctatt ggtttcctta aggcccacaa 780 
cttagacaag gacttcaagg tcaacattga agttaaccac gctactcttg ctggtcacac 840 
tttcgaacac gaacttgcct gtgctgttga tgctggtatg ctcggttcca ttgatgctaa 900 
ccgtggtgac taccaaaacg gttgggatac tgatcaattc ccaattgatc aatacgaact 960 
cgtccaagct tggatggaaa tcatccgtgg tggtggtttc gttactggtg gtaccaactt 1020 
cgatgccaag actcgtcgta actctactga cctcgaagac atcatcattg cccacgtttc 1080 
tggtatggat gctatggctc gtgctcttga aaacgctgcc aagctcctcc aagaatctcc 1140 
atacaccaag atgaagaagg aacgttacgc ttccttcgac agtggtattg gtaaggactt 1200 
tgaagatggt aagctcaccc tcgaacaagt ttacgaatac ggtaagaaga acggtgaacc 1260 
aaagcaaact tctggtaagc aagaactcta cgaagctatt gttgccatgt accaataagt 1320 
taatcgtagt taaattggta aaataattgt aaaatcaata aacttgtcaa tcctccaatc 1380 
aagtttaaaa gatcctatct ctgtactaat taaatatagt acaaaaaaaa atgtataaac 1440 
aaaaaaaagt ctaaaagacg gaagaattta atttagggaa aaaataaaaa taataataaa 1500 
caatagataa atcctttata ttaggaaaat gtcccattgt attattttca tttctactaa 1560 
aaaagaaagt aaataaaaca caagaggaaa ttttcccttt tttttttttt tgtaataaat 1620 
tttatgcaaa tataaatata aataaaataa taaaaaaaaa aaaaaaaaa 1669 



<210> 3 
<211> 494 
<212> PRT 
<213> Piromyces sp. 

<400> 3 

Met Lys Thr Val Ala Gly Ile Asp Leu Gly Thr Gln Ser Met Lys Val
  1               5                  10                  15

Val Ile Tyr Asp Tyr Glu Lys Lys Glu Ile Ile Glu Ser Ala Ser Cys
             20                  25                  30

Pro Met Glu Leu Ile Ser Glu Ser Asp Gly Thr Arg Glu Gln Thr Thr
         35                  40                  45

Glu Trp Phe Asp Lys Gly Leu Glu Val Cys Phe Gly Lys Leu Ser Ala
     50                  55                  60

Asp Asn Lys Lys Thr Ile Glu Ala Ile Gly Ile Ser Gly Gln Leu His
 65                  70                  75                  80

Gly Phe Val Pro Leu Asp Ala Asn Gly Lys Ala Leu Tyr Asn Ile Lys
                 85                  90                  95

Leu Trp Cys Asp Thr Ala Thr Val Glu Glu Cys Lys Ile Ile Thr Asp
            100                 105                 110

Ala Ala Gly Gly Asp Lys Ala Val Ile Asp Ala Leu Gly Asn Leu Met
        115                 120                 125

Leu Thr Gly Phe Thr Ala Pro Lys Ile Leu Trp Leu Lys Arg Asn Lys
    130                 135                 140

Pro Glu Ala Phe Ala Asn Leu Lys Tyr Ile Met Leu Pro His Asp Tyr
145                 150                 155                 160

Leu Asn Trp Lys Leu Thr Gly Asp Tyr Val Met Glu Tyr Gly Asp Ala
                165                 170                 175

Ser Gly Thr Ala Leu Phe Asp Ser Lys Asn Arg Cys Trp Ser Lys Lys
            180                 185                 190

Ile Cys Asp Ile Ile Asp Pro Lys Leu Leu Asp Leu Leu Pro Lys Leu
        195                 200                 205

Ile Glu Pro Ser Ala Pro Ala Gly Lys Val Asn Asp Glu Ala Ala Lys
    210                 215                 220

Ala Tyr Gly Ile Pro Ala Gly Ile Pro Val Ser Ala Gly Gly Gly Asp
225                 230                 235                 240

Asn Met Met Gly Ala Val Gly Thr Gly Thr Val Ala Asp Gly Phe Leu
                245                 250                 255

Thr Met Ser Met Gly Thr Ser Gly Thr Leu Tyr Gly Tyr Ser Asp Lys
            260                 265                 270

Pro Ile Ser Asp Pro Ala Asn Gly Leu Ser Gly Phe Cys Ser Ser Thr
        275                 280                 285

Gly Gly Trp Leu Pro Leu Leu Cys Thr Met Asn Cys Thr Val Ala Thr
    290                 295                 300

Glu Phe Val Arg Asn Leu Phe Gln Met Asp Ile Lys Glu Leu Asn Val
305                 310                 315                 320

Glu Ala Ala Lys Ser Pro Cys Gly Ser Glu Gly Val Leu Val Ile Pro
                325                 330                 335

Phe Phe Asn Gly Glu Arg Thr Pro Asn Leu Pro Asn Gly Arg Ala Ser
            340                 345                 350

Ile Thr Gly Leu Thr Ser Ala Asn Thr Ser Arg Ala Asn Ile Ala Arg
        355                 360                 365

Ala Ser Phe Glu Ser Ala Val Phe Ala Met Arg Gly Gly Leu Asp Ala
    370                 375                 380

Phe Arg Lys Leu Gly Phe Gln Pro Lys Glu Ile Arg Leu Ile Gly Gly
385                 390                 395                 400

Gly Ser Lys Ser Asp Leu Trp Arg Gln Ile Ala Ala Asp Ile Met Asn
                405                 410                 415

Leu Pro Ile Arg Val Pro Leu Leu Glu Glu Ala Ala Ala Leu Gly Gly
            420                 425                 430

Ala Val Gln Ala Leu Trp Cys Leu Lys Asn Gln Ser Gly Lys Cys Asp
        435                 440                 445

Ile Val Glu Leu Cys Lys Glu His Ile Lys Ile Asp Glu Ser Lys Asn
    450                 455                 460

Ala Asn Pro Ile Ala Glu Asn Val Ala Val Tyr Asp Lys Ala Tyr Asp
465                 470                 475                 480

Glu Tyr Cys Lys Val Val Asn Thr Leu Ser Pro Leu Tyr Ala
                485                 490



<210> 4 
<211> 2041 
<212> DNA 
<213> Piromyces sp. 

<400> 4 

attatataaa ataactttaa ataaaacaat ttttatttgt ttatttaatt attcaaaaaa 60 
aattaaagta aaagaaaaat aatacagtag aacaatagta ataatatcaa aatgaagact 120 
gttgctggta ttgatcttgg aactcaaagt atgaaagtcg ttatttacga ctatgaaaag 180 
aaagaaatta ttgaaagtgc tagctgtcca atggaattga tttccgaaag tgacggtacc 240 
cgtgaacaaa ccactgaatg gtttgacaag ggtcttgaag tttgttttgg taagcttagt 300 
gctgataaca aaaagactat tgaagctatt ggtatttctg gtcaattaca cggttttgtt 360 
cctcttgatg ctaacggtaa ggctttatac aacatcaaac tttggtgtga tactgctacc 420 
gttgaagaat gtaagattat cactgatgct gccggtggtg acaaggctgt tattgatgcc 480 
cttggtaacc ttatgctcac cggtttcacc gctccaaaga tcctctggct caagcgcaac 540 
aagccagaag ctttcgctaa cttaaagtac attatgcttc cacacgatta cttaaactgg 600 
aagcttactg gtgattacgt tatggaatac ggtgatgcct ctggtaccgc tctcttcgat 660 
tctaagaacc gttgctggtc taagaagatt tgcgatatca ttgacccaaa acttttagat 720 
ttacttccaa agttaattga accaagcgct ccagctggta aggttaatga tgaagccgct 780 
aaggcttacg gtattccagc cggtattcca gtttccgctg gtggtggtga taacatgatg 840 
ggtgctgttg gtactggtac tgttgctgat ggtttcctta ccatgtctat gggtacttct 900 
ggtactcttt acggttacag tgacaagcca attagtgacc cagctaatgg tttaagtggt 960 
ttctgttctt ctactggtgg atggcttcca ttactttgta ctatgaactg tactgttgcc 1020 
actgaattcg ttcgtaacct cttccaaatg gatattaagg aacttaatgt tgaagctgcc 1080 
aagtctccat gtggtagtga aggtgtttta gttattccat tcttcaatgg tgaaagaact 1140 
ccaaacttac caaacggtcg tgctagtatt actggtctta cttctgctaa caccagccgt 1200 
gctaacattg ctcgtgctag tttcgaatcc gccgttttcg ctatgcgtgg tggtttagat 1260 
gctttccgta agttaggttt ccaaccaaag gaaattcgtc ttattggtgg tggttctaag 1320 
tctgatctct ggagacaaat tgccgctgat atcatgaacc ttccaatcag agttccactt 1380 
ttagaagaag ctgctgctct tggtggtgct gttcaagctt tatggtgtct taagaaccaa 1440 
tctggtaagt gtgatattgt tgaactttgc aaagaacaca ttaagattga tgaatctaag 1500 
aatgctaacc caattgccga aaatgttgct gtttacgaca aggcttacga tgaatactgc 1560 
aaggttgtaa atactctttc tccattatat gcttaaattg ccaatgtaaa aaaaaatata 1620 
atgccatata attgccttgt caatacactg ttcatgttca tataatcata ggacattgaa 1680 
tttacaaggt ttatacaatt aatatctatt atcatattat tatacagcat ttcattttct 1740 
aagattagac gaaacaattc ttggttcctt gcaatataca aaatttacat gaatttttag 1800 
aatagtctcg tatttatgcc caataatcag gaaaattacc taatgctgga ttcttgttaa 1860 
taaaaacaaa ataaataaat taaataaaca aataaaaatt ataagtaaat ataaatatat 1920 
aagtaatata aaaaaaaagt aaataaataa ataaataaat aaaaattttt tgcaaatata 1980 
taaataaata aataaaatat aaaaataatt tagcaaataa attaaaaaaa aaaaaaaaaa 2040 
a 2041 



